 scoping review


Understanding AI Trustworthiness: A Scoping Review of AIES & FAccT Articles

Mehrotra, Siddharth, Huang, Jin, Fu, Xuelong, Dobbe, Roel, Sánchez, Clara I., de Rijke, Maarten

arXiv.org Artificial Intelligence

Background: Trustworthy AI serves as a foundational pillar for two major AI ethics conferences: AIES and FAccT. However, current research often adopts techno-centric approaches, focusing primarily on technical attributes such as reliability, robustness, and fairness, while overlooking the sociotechnical dimensions critical to understanding AI trustworthiness in real-world contexts. Objectives: This scoping review aims to examine how the AIES and FAccT communities conceptualize, measure, and validate AI trustworthiness, identifying major gaps and opportunities for advancing a holistic understanding of trustworthy AI systems. Methods: We conduct a scoping review of AIES and FAccT conference proceedings to date, systematically analyzing how trustworthiness is defined, operationalized, and applied across different research domains. Our analysis focuses on conceptualization approaches, measurement methods, verification and validation techniques, application areas, and underlying values. Results: While significant progress has been made in defining technical attributes such as transparency, accountability, and robustness, our findings reveal critical gaps. Current research predominantly emphasizes technical precision at the expense of social and ethical considerations. The sociotechnical nature of AI systems remains less explored, and trustworthiness emerges as a contested concept shaped by those with the power to define it. Conclusions: An interdisciplinary approach combining technical rigor with social, cultural, and institutional considerations is essential for advancing trustworthy AI. We propose actionable measures for the AI ethics community to adopt holistic frameworks that genuinely address the complex interplay between AI systems and society, ultimately promoting responsible technological development that benefits all stakeholders.


A Scoping Review of Machine Learning Applications in Power System Protection and Disturbance Management

Oelhaf, Julian, Kordowich, Georg, Pashaei, Mehran, Bergler, Christian, Maier, Andreas, Jäger, Johann, Bayer, Siming

arXiv.org Artificial Intelligence

The integration of renewable and distributed energy resources reshapes modern power systems, challenging conventional protection schemes. This scoping review synthesizes recent literature on machine learning (ML) applications in power system protection and disturbance management, following the PRISMA for Scoping Reviews framework. Based on over 100 publications, three key objectives are addressed: (i) assessing the scope of ML research in protection tasks; (ii) evaluating ML performance across diverse operational scenarios; and (iii) identifying methods suitable for evolving grid conditions. ML models often demonstrate high accuracy on simulated datasets; however, their performance under real-world conditions remains insufficiently validated. The existing literature is fragmented, with inconsistencies in methodological rigor, dataset quality, and evaluation metrics. This lack of standardization hampers the comparability of results and limits the generalizability of findings. To address these challenges, this review introduces an ML-oriented taxonomy for protection tasks, resolves key terminological inconsistencies, and advocates for standardized reporting practices. It further provides guidelines for comprehensive dataset documentation, methodological transparency, and consistent evaluation protocols, aiming to improve reproducibility and enhance the practical relevance of research outcomes. Critical gaps remain, including the scarcity of real-world validation, insufficient robustness testing, and limited consideration of deployment feasibility. Future research should prioritize public benchmark datasets, realistic validation methods, and advanced ML architectures. These steps are essential to move ML-based protection from theoretical promise to practical deployment in increasingly dynamic and decentralized power systems.


Scoping Review of Active Learning Strategies and their Evaluation Environments for Entity Recognition Tasks

Kohl, Philipp, Krämer, Yoka, Fohry, Claudia, Kraft, Bodo

arXiv.org Artificial Intelligence

We conducted a scoping review for active learning in the domain of natural language processing (NLP), which we summarize in accordance with the PRISMA-ScR guidelines as follows: Objective: Identify active learning strategies that were proposed for entity recognition and their evaluation environments (datasets, metrics, hardware, execution time). Design: We used Scopus and ACM as our search engines. We compared the results with two literature surveys to assess the search quality. We included peer-reviewed English publications introducing or comparing active learning strategies for entity recognition. Results: We analyzed 62 relevant papers and identified 106 active learning strategies. We grouped them into three categories: exploitation-based (60x), exploration-based (14x), and hybrid strategies (32x). We found that all studies used the F1-score as an evaluation metric. Information about hardware (6x) and execution time (13x) was only occasionally included. The 62 papers used 57 different datasets to evaluate their respective strategies. Most datasets contained newspaper articles or biomedical/medical data. Our analysis revealed that 26 out of 57 datasets are publicly accessible. Conclusion: Numerous active learning strategies have been identified, along with significant open questions that still need to be addressed. Researchers and practitioners face difficulties when making data-driven decisions about which active learning strategy to adopt. Conducting comprehensive empirical comparisons using the evaluation environment proposed in this study could help establish best practices in the domain.


A Scoping Review of Energy Load Disaggregation

Tolnai, Balázs András, Ma, Zheng, Jørgensen, Bo Nørregaard

arXiv.org Artificial Intelligence

Energy load disaggregation can contribute to balancing power grids by enhancing the effectiveness of demand-side management and promoting electricity-saving behavior through increased consumer awareness. However, the field currently lacks a comprehensive overview. To address this gap, this paper conducts a scoping review of load disaggregation domains, data types, and methods, by assessing 72 full-text journal articles. The findings reveal that domestic electricity consumption is the most researched area, while others, such as industrial load disaggregation, are rarely discussed. The majority of research uses relatively low-frequency data, sampled between 1 and 60 seconds. A wide variety of methods are used, and artificial neural networks are the most common, followed by optimization strategies, Hidden Markov Models, and Graph Signal Processing approaches.


Question Answering for Electronic Health Records: A Scoping Review of datasets and models

Bardhan, Jayetri, Roberts, Kirk, Wang, Daisy Zhe

arXiv.org Artificial Intelligence

Question Answering (QA) systems on patient-related data can assist both clinicians and patients. They can, for example, assist clinicians in decision-making and enable patients to have a better understanding of their medical history. Significant amounts of patient data are stored in Electronic Health Records (EHRs), making EHR QA an important research area. In EHR QA, the answer is obtained from the medical record of the patient. Because of the differences in data format and modality, this differs greatly from other medical QA tasks that employ medical websites or scientific papers to retrieve answers, making it critical to research EHR question answering. This study aimed to provide a methodological review of existing works on QA over EHRs. We searched for articles from January 1st, 2005 to September 30th, 2023 in four digital sources including Google Scholar, ACL Anthology, ACM Digital Library, and PubMed to collect relevant publications on EHR QA. 4111 papers were identified for our study, and after screening based on our inclusion criteria, we obtained a total of 47 papers for further study. Out of the 47 papers, 25 papers were about EHR QA datasets, and 37 papers were about EHR QA models. It was observed that QA on EHRs is relatively new and unexplored. Most of the works are fairly recent. Also, it was observed that emrQA is by far the most popular EHR QA dataset, both in terms of citations and usage in other papers. Furthermore, we identified the different models used in EHR QA along with the evaluation metrics used for these models.


Robot's Gendering Trouble: A Scoping Review of Gendering Humanoid Robots and its Effects on HRI

Perugia, Giulia, Lisy, Dominika

arXiv.org Artificial Intelligence

The discussion around the problematic practice of gendering humanoid robots has risen to the foreground in the last few years. To lay the basis for a thorough understanding of how robot's "gender" has been understood within the Human-Robot Interaction (HRI) community - i.e., how it has been manipulated, in which contexts, and which effects it has yielded on people's perceptions and interactions with robots - we performed a scoping review of the literature. We identified 553 papers relevant to our review retrieved from 5 different databases. The final sample of reviewed papers included 35 papers written between 2005 and 2021, which involved a total of 3902 participants. In this article, we thoroughly summarize these papers by reporting information about their objectives and assumptions on gender (i.e., definitions and reasons to manipulate gender), their manipulation of robot's "gender" (i.e., gender cues and manipulation checks), their experimental designs (e.g., demographics of participants, employed robots), and their results (i.e., main and interaction effects). The review reveals that robot's "gender" does not affect crucial constructs for the HRI, such as likability and acceptance, but rather bears its strongest effect on stereotyping. We leverage our different epistemological backgrounds in Social Robotics and Gender Studies to provide a comprehensive interdisciplinary perspective on the results of the review and suggest ways to move forward in the field of HRI.


Transfer Learning Approaches for Neuroimaging Analysis: A Scoping Review

#artificialintelligence

Deep learning algorithms have been moderately successful in diagnoses of diseases by analyzing medical images, especially through neuroimaging that is rich in annotated data. Transfer learning methods have demonstrated strong performance in tackling annotated data. Transfer learning utilizes and transfers knowledge learned from a source domain to a target domain even when the dataset is small. There are multiple approaches to transfer learning that result in a range of performance estimates in diagnosis, detection, and classification of clinical problems. Therefore, in this paper, we reviewed transfer learning approaches, their design attributes, and their applications to neuroimaging problems. We reviewed two main literature databases and included the most relevant studies using predefined inclusion criteria. Among 50 reviewed studies, more than half are on transfer learning for Alzheimer's disease. Brain mapping and brain tumor detection were the second and third most discussed research problems, respectively. The most common source dataset for transfer learning was ImageNet, which is not a neuroimaging dataset. This suggests that the majority of studies preferred pre-trained models instead of training their own model on a neuroimaging dataset. Although about one third of studies designed their own architecture, most studies used existing Convolutional Neural Network architectures. Magnetic Resonance Imaging was the most common imaging modality. In almost all studies, transfer learni...


Research on Artificial Intelligence and Primary Care: A Scoping Review

#artificialintelligence

Objective: The purpose of this study was to assess the nature and extent of the body of research on artificial intelligence (AI) and primary care. Methods: We performed a scoping review, searching 11 published and grey literature databases with subject headings and key words pertaining to the concepts of 1) AI and 2) primary care: MEDLINE, EMBASE, CINAHL, Cochrane Library, Web of Science, Scopus, IEEE Xplore, ACM Digital Library, MathSciNet, AAAI, arXiv. Screening included title and abstract and then full text stages. Final inclusion criteria: 1) research study of any design, 2) developed or used AI, 3) used primary care data and/or study conducted in a primary care setting and/or explicit mention of study applicability to primary care; exclusion criteria: 1) narrative, editorial, or textbook chapter, 2) not applicable to primary care population or settings, 3) full text inaccessible in the English language. We extracted and summarized seven key characteristics of included studies: overall study purpose(s), author appointments, primary care functions, author intended target end user(s), target health condition(s), location of data source(s) (if any), subfield(s) of AI.